Data Project 2: UV Spectral Line Analysis

Question One: Constructing a Spectrum for Analysis

1.1

I’ve put the dataset on Canvas, but it’s also good to see what the MAST is and what it can do for you (hint, it’s amazing!).

Here are the directions for downloading the data from MAST (please read even if you get data off of Canvas):

Go to the Space Telescope archive (http://archive.stsci.edu/mast.html) and download GHRS ECH-B spectra of the star HZ 43. What kind of a star is HZ 43? (Hint, consult SIMBAD). The GHRS dataset contains a series of .fits files. *c0f.fits contains the wavelength solutions, *c1f.fits contains the calibrated fluxes, *c2f.fits contains the uncertainties. Consult http://www.stsci.edu/documents/dhb/pdf/GHRS.pdf for details. There are two HST GHRS ECH-B datasets for HZ 43. Choose the one that contains the Mg II absorption lines near 2796 and 2803 Å. This dataset appears to have 4 identical copies available on the archive - two from HST online and two from DADS.

According to SIMBAD, the star HZ 43 is a white dwarf!

1.2

Look through the GHRS.pdf document to understand why the 16 subexposures in the dataset have four different sets of wavelength solutions (look in Section 35.6). Follow the recommended procedure and align the subexposures by cross-correlating the fluxes (here, don’t just use numpy’s or scipy’s correlate functions!) against the first exposure obtained, then shifting by the number of wavelength bins required to achieve maximum cross-correlation. You may want to co-add each set of 4 before cross-correlating to improve s/n. You should get more than 10 pixel shifts, and you have to think about what the correlation does, to do it properly. Eq 3.102 of Ivezic should help connect the integral definition in class to the discrete function.

To create the cross-correlation function in Python, I will use the following equation, which was pulled from Wikipedia:

The idea of this function is to sweep the second flux array over the first flux array from right to left; the summed overlap between the two flux arrays at each offset is the cross-correlation. np.roll() shifts an array circularly by a given number of elements (it does not reverse it). For each trial shift, I roll the second flux array, multiply each element of the first flux array by the corresponding element of the shifted second flux array, and then sum over that multiplication to get the cross-correlation at that shift.
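The sweep-multiply-sum idea can be sketched as below. This is a minimal illustration, not the code I submitted: the function names, the `max_shift` parameter, and the mean subtraction are assumptions added for the example, and the data are synthetic. Because np.roll() wraps around, pixels pushed off one end reappear at the other, which is harmless only if the shifts are small compared with the array length.

```python
import numpy as np

def cross_correlate(reference, target, max_shift):
    """Circular cross-correlation of two equal-length 1-D arrays.

    Returns the trial shifts and the correlation at each shift.
    """
    # Subtract the mean so the continuum level does not dominate the sum
    # (an assumption of this sketch, not part of the description above).
    ref = reference - np.mean(reference)
    tgt = target - np.mean(target)
    shifts = np.arange(-max_shift, max_shift + 1)
    # For each trial shift: roll the target, multiply element-wise, and sum.
    corr = np.array([np.sum(ref * np.roll(tgt, s)) for s in shifts])
    return shifts, corr

def best_shift(reference, target, max_shift):
    """Shift (in bins) to pass to np.roll so that `target` lines up with `reference`."""
    shifts, corr = cross_correlate(reference, target, max_shift)
    return int(shifts[np.argmax(corr)])

# Synthetic check: one absorption dip, displaced 17 bins to the left.
pixels = np.arange(500)
reference = 1.0 - 0.8 * np.exp(-0.5 * ((pixels - 250) / 5.0) ** 2)
target = np.roll(reference, -17)
shift = best_shift(reference, target, max_shift=60)
aligned = np.roll(target, shift)
```

The same shift has to be applied to the uncertainty array of each subexposure so that fluxes and errors stay matched pixel by pixel.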

To create the cross correlation function I worked with Anna.

The pixel shifts (the number of wavelength bins required to achieve maximum cross-correlation) are 17, 33, and 52.

1.3

Co-add (i.e., find the mean of) all aligned subexposures, using uniform weighting since presumably each of the sub-exposures has the same integration time. Calculate the uncertainty in your co-added spectrum using the given uncertainties for the individual spectra. Turn in a plot of the co-added spectrum, with the uncertainty values overplotted in some fashion (either along the bottom of the plot or plt.errorbar). Make a plot of the s/n as a function of wavelength. Make sure to label axes with the correct units.

Because I didn't want to LaTeX my handwritten work, see the Canvas dropbox submission for the written work for this question.
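For reference, the co-add with uniform weights and its propagated uncertainty can be sketched as follows. This is a minimal sketch assuming the subexposures are already aligned and their errors are independent; the function name `coadd` and the example numbers are hypothetical.

```python
import numpy as np

def coadd(fluxes, errors):
    """Uniformly weighted mean of aligned spectra and its 1-sigma uncertainty.

    fluxes, errors: arrays of shape (n_exposures, n_pixels).
    """
    fluxes = np.asarray(fluxes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    n = fluxes.shape[0]
    mean_flux = fluxes.mean(axis=0)
    # Independent errors add in quadrature; dividing the sum by n divides sigma by n.
    mean_err = np.sqrt(np.sum(errors ** 2, axis=0)) / n
    return mean_flux, mean_err

# Hypothetical example: 4 exposures, 3 pixels, error 0.2 per pixel per exposure.
fluxes = np.array([[1.0, 2.0, 3.0]] * 4)
errors = np.full((4, 3), 0.2)
mean_flux, mean_err = coadd(fluxes, errors)
snr = mean_flux / mean_err   # signal-to-noise per pixel
```

With equal errors the uncertainty of the mean drops as 1/sqrt(n), so four exposures halve the per-pixel error.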

1.4

Is the s/n between the two absorption lines consistent with the standard deviation of the flux values? What is the minimum percent change relative to the continuum between the lines at which you could confidently detect an emission or absorption line?

I am assuming that you want us to see if the s/n between the absorption lines is consistent with the mean +/- standard deviation of the flux values.
I got the mean +/- the standard deviation of the flux values to be 12.32 +/- 0.62, and the s/n between the two absorption lines to be 11.66, which is about one standard deviation below the mean. Thus, yes, the s/n between the two absorption lines is consistent with the mean +/- the standard deviation of the flux values.

The minimum percent change relative to the continuum between the lines at which I could confidently detect an emission or absorption line is approximately 25.73%.
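As a quick arithmetic check, a 3-sigma detection threshold applied to the s/n of 11.66 reproduces this number. The 3-sigma choice is an assumption of this sketch (the handwritten derivation is in the Canvas submission), and the variable names are illustrative.

```python
snr_between_lines = 11.66   # s/n in the continuum between the two lines (from above)
n_sigma = 3.0               # assumed detection threshold

# A feature is detectable when its depth exceeds n_sigma times the noise,
# i.e. when the fractional change exceeds n_sigma / (s/n).
min_percent_change = 100.0 * n_sigma / snr_between_lines
print(f"{min_percent_change:.2f}%")
```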

Because I didn't want to LaTeX my handwritten work, see the Canvas dropbox submission for the written work for this question.

Question Two: Normalizing by the Continuum

2.1

Search for the interstellar absorption lines of Mg II (near 2796 and 2803 Å). What are the laboratory wavelengths of these lines? (See Morton, D. 1991, ApJS, 77, 119 for this and oscillator strengths (f values); also can use NIST database: https://physics.nist.gov/PhysRefData/ASD/lines_form.html and make sure to select the appropriate wavelength scale (air or vacuum).)

According to the NIST database, the laboratory wavelengths of these absorption lines on the vacuum scale are 2796.352 Å and 2803.530 Å.

2.2

What is the s/n in the continuum on either side of these lines? What is the s/n in the absorption line? If the noise is dominated by Poisson noise from photon counts, what would you expect the relationship between fractional uncertainties in the continuum and the line to be? Check to see if your s/n ratio is consistent with a Poisson distribution.

The s/n for the continuum on the left side of the left Mg II absorption line (line near 2796 Å) is 12.23. The s/n for the continuum on the right side of the right absorption line (line near 2803 Å) is 10.84. And, from part 1.4, we saw that the s/n for the continuum between the two lines (to the right of the line near 2796 Å and to the left of the line near 2803 Å) is 12.09.

The s/n for the left absorption line (near 2796 Å) is 5.19. The s/n for the right absorption line (near 2803 Å) is 7.07.

I expect the ratio between the fractional uncertainties in the continuum between the absorption lines and in the left absorption line (near 2796 Å) to be 0.472. The measured s/n ratio, using the left absorption line (near 2796 Å) and the continuum in between the two absorption lines, was 0.445. The two values differ by only about 6%, so yes, my s/n ratio is consistent with a Poisson distribution.
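The reasoning behind this comparison: for Poisson noise the uncertainty scales as the square root of the counts, so s/n also scales as the square root of the flux, and the s/n in the line divided by the s/n in the continuum should equal sqrt(F_line / F_continuum). A minimal sketch of that check follows. The function name and the example fluxes are hypothetical, and the continuum s/n of 11.66 from part 1.4 is used because it is the value that reproduces the 0.445 quoted above.

```python
import numpy as np

def expected_snr_ratio(flux_line, flux_continuum):
    """Poisson expectation for (s/n in the line) / (s/n in the continuum)."""
    return np.sqrt(flux_line / flux_continuum)

# Measured ratio from the values quoted in this write-up.
snr_line = 5.19          # left line, near 2796 Å
snr_continuum = 11.66    # continuum between the lines (part 1.4)
measured_ratio = snr_line / snr_continuum

# Hypothetical example: a line whose core drops to 25% of the continuum flux.
example_ratio = expected_snr_ratio(0.25, 1.0)
```

The expected ratio for the real data comes from putting the measured line-core and continuum fluxes into `expected_snr_ratio`.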

Because I didn't want to LaTeX the formulas needed for this question (assuming the noise is dominated by a Poisson distribution, and considering the relationship between the fractional uncertainties in the continuum and the line), see the Canvas dropbox submission for the written work for this question.

2.3

Fit a low order polynomial (even just a straight line with a slope is fine) to the continuum, masking out the absorption lines. What are the uncertainties in your fit parameters? Turn in a plot of the spectrum with your fit to the background overplotted. (This could be a good chance to write up that design matrix algorithm I’ve been talking about, but a canned routine is fine too.).

Coni helped me figure out how to mask out the absorption lines (i.e. delete the points in the array that were at the absorption lines). I tried using np.remove(), as well as adding arrays to the left of, to the right of, and in between the absorption lines (but the arrays weren't of the same shape), and more, but I kept getting errors that I couldn't debug quickly enough. She sent me her code and I used it, as you said I could over email. I understand completely how the code works:

  1. Find the data points in the array that are to the left of, in between, and to the right of the absorption lines.
  2. Collect the data points that satisfy the requirement of being to the left of, to the right of, or in between the two absorption lines.
  3. Create two new arrays: (a) the flux array with the absorption line values removed, and (b) the wavelength array with the absorption line values removed.
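The three steps above can be sketched with a boolean mask. This is an illustrative version, not Coni's code: the function name `mask_lines` and the window edges are assumptions, and the windows would need to be adjusted to where the lines actually fall in the data.

```python
import numpy as np

def mask_lines(wavelength, flux, line_windows):
    """Return wavelength and flux with the points inside each (lo, hi) window removed."""
    wavelength = np.asarray(wavelength)
    flux = np.asarray(flux)
    keep = np.ones(wavelength.shape, dtype=bool)
    for lo, hi in line_windows:
        # Drop every point that falls inside this absorption-line window.
        keep &= ~((wavelength >= lo) & (wavelength <= hi))
    return wavelength[keep], flux[keep]

# Hypothetical windows around the two Mg II lines (Å) and a flat test spectrum.
windows = [(2796.0, 2796.6), (2803.2, 2803.8)]
wl = np.linspace(2793.0, 2806.0, 1301)
fl = np.ones_like(wl)
wl_cont, fl_cont = mask_lines(wl, fl, windows)
```

Because one mask indexes both arrays, the wavelength and flux arrays always keep the same shape, which avoids the shape-mismatch errors mentioned above.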

I fit a degree-one polynomial to the continuum with the two absorption lines masked out. Note that a degree-one polynomial fit gives the model y = slope*x + intercept. The parameter estimates with their corresponding uncertainties for this degree-one polynomial fit are:
1. Intercept: -5.583116058824341e-13 +/- 2.832114711126018e-13
2. Slope: 2.8235668811163627e-16 +/- 1.0115119626338277e-16
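For comparison with a canned routine, the design-matrix version of a straight-line fit can be sketched as below. This is a minimal weighted least-squares sketch, not the routine used for the numbers above; the function name `fit_line` and the synthetic data are assumptions.

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns the parameters [intercept, slope] and their 1-sigma uncertainties.
    """
    x, y, sigma = (np.asarray(a, dtype=float) for a in (x, y, sigma))
    A = np.column_stack([np.ones_like(x), x])      # design matrix
    w = 1.0 / sigma ** 2                           # inverse-variance weights
    cov = np.linalg.inv(A.T @ (A * w[:, None]))    # parameter covariance matrix
    params = cov @ (A.T @ (w * y))                 # best-fit parameters
    return params, np.sqrt(np.diag(cov))

# Synthetic check with a known line and no noise.
x = np.linspace(-5.0, 5.0, 50)
y = 2.0 + 0.5 * x
params, param_err = fit_line(x, y, sigma=np.full_like(x, 0.1))
```

With wavelengths near 2800 Å the intercept and slope are strongly correlated; subtracting a reference wavelength from x before fitting is one way to reduce that correlation.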

2.4

Normalize the spectrum by dividing by your best fit continuum polynomial or line. The continuum should now be relatively flat, with a mean value of unity in the continuum. Turn in a plot of the normalized spectrum.

I confirm that the continuum is relatively flat, with a mean value of unity in the continuum.
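The normalization step itself is a division by the fitted line, applied to both the flux and its uncertainty. A minimal sketch, using the fit parameters reported in 2.3; the function name is hypothetical, and the uncertainty of the continuum fit itself is ignored here.

```python
import numpy as np

def normalize(wavelength, flux, flux_err, intercept, slope):
    """Divide flux and its uncertainty by the straight-line continuum model."""
    continuum = intercept + slope * np.asarray(wavelength, dtype=float)
    return np.asarray(flux) / continuum, np.asarray(flux_err) / continuum

# Fit parameters from 2.3; the test spectrum is exactly the continuum model.
intercept = -5.583116058824341e-13
slope = 2.8235668811163627e-16
wl = np.linspace(2793.0, 2806.0, 200)
flux = intercept + slope * wl
norm_flux, norm_err = normalize(wl, flux, 0.1 * flux, intercept, slope)
continuum_mean = norm_flux.mean()
```

On real data, `continuum_mean` should be computed over the masked (line-free) pixels only.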

Question Three: Line Analysis

3.1

Fit each of the two absorption lines with a Gaussian. Report your best-fit values and uncertainties for the fit parameters, including the central wavelength, amplitude, and width. Are the single Gaussian fits acceptable? Turn in a plot of the normalized spectrum with the Gaussian fits overplotted.

To fit the gaussian I copied code from:
https://stackoverflow.com/questions/59047395/fitting-gaussian-to-absorbtion-line-in-python

The square roots of the diagonal elements of the covariance matrix are the uncertainties associated with each parameter, respectively.

I fit the two absorption lines with a Gaussian. The parameters used to fit the model were central wavelength (mean), amplitude, and width (standard deviation).

The parameter estimates for the Gaussian fit of the first absorption line (near 2796 Å) with their corresponding uncertainties are as follows:

1. Mean: 2.79629393e+03 +/- 0.0010641069824035552
2. Amplitude: 7.83477723e-01 +/- 0.02573561565224349
3. Standard Deviation: 2.80868715e-02 +/- 0.0010829875899565978

The parameter estimates for the Gaussian fit of the second absorption line (near 2803 Å) with their corresponding uncertainties are as follows:
1. Mean: 2.80347993e+03 +/- 0.0018593463851579672
2. Amplitude: 6.19312544e-01 +/- 0.0400935179299597
3. Standard Deviation: 2.49001433e-02 +/- 0.0018673348494579111

Yes, the Gaussian fits of the absorption lines are acceptable. The plot of both of the two fits looks pretty good and the uncertainties are super small! I'm shocked at how well they fit the data.
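A quantitative way to back up "acceptable" is a reduced chi-square near 1. The sketch below is not the Stack Overflow code I used: it fits a synthetic line with scipy.optimize.curve_fit, takes the uncertainties from the covariance diagonal, and computes the reduced chi-square. The model form (continuum of 1 minus a Gaussian), the noise level of 0.02, and the starting guesses are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def absorption_gaussian(wavelength, center, amplitude, width):
    """Normalized continuum (1.0) minus a Gaussian absorption profile."""
    return 1.0 - amplitude * np.exp(-0.5 * ((wavelength - center) / width) ** 2)

# Synthetic line built from the first fit's reported parameters, plus noise.
rng = np.random.default_rng(0)
wl = np.linspace(2796.0, 2796.6, 120)
true_params = (2796.29393, 0.783477723, 0.0280868715)
sigma = np.full_like(wl, 0.02)
flux = absorption_gaussian(wl, *true_params) + rng.normal(0.0, 0.02, wl.size)

popt, pcov = curve_fit(absorption_gaussian, wl, flux,
                       p0=(2796.3, 0.5, 0.05), sigma=sigma, absolute_sigma=True)
perr = np.sqrt(np.diag(pcov))   # 1-sigma parameter uncertainties

# Reduced chi-square: values near 1 indicate an acceptable fit.
residuals = (flux - absorption_gaussian(wl, *popt)) / sigma
chi2_red = np.sum(residuals ** 2) / (wl.size - len(popt))
```

Passing `absolute_sigma=True` keeps the reported uncertainties tied to the input error bars instead of rescaling them by the fit quality.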

3.2

Calculate the central velocity for each line, and the uncertainties in the central velocities. Use the convention that a positive velocity is away from the observer (careful!). Are the central velocities of these lines consistent with each other? Test the consistency by using a differenced χ2 consistency test, taking into account the uncertainties in the laboratory wavelengths of these lines.

The central velocities with their associated uncertainties are:

  1. First absorption line (near 2796 Å): -6209.12674621733 +/- 113.77966292471372 m/s
  2. Second absorption line (near 2803 Å): -5340.074961801732 +/- 198.30163008107883 m/s

The central velocities of the two absorption lines differ by about 870 m/s, which is large compared with their uncertainties, so they appear inconsistent with each other.

According to Morton 1991, the uncertainty in the laboratory wavelengths of these lines is 0.001 Å.

Because I didn't want to LaTeX my handwritten work for the propagation of errors, see the Canvas dropbox submission for the written work for this part.
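The velocity and its propagated uncertainty can be sketched as follows, using the non-relativistic Doppler formula with positive velocity meaning motion away from the observer. The function name is hypothetical, and the inputs are the fitted centroids and NIST wavelengths quoted earlier. The values listed in 3.2 are reproduced with c = 2.99e8 m/s; with the exact speed of light used here the magnitudes come out roughly 0.3% larger.

```python
C_M_PER_S = 299_792_458.0   # speed of light in m/s

def central_velocity(lam_obs, lam_obs_err, lam_lab, lam_lab_err=0.0):
    """Non-relativistic Doppler velocity in m/s; positive means receding."""
    v = C_M_PER_S * (lam_obs - lam_lab) / lam_lab
    # Propagate errors through v = c * (lam_obs / lam_lab - 1).
    dv_dobs = C_M_PER_S / lam_lab
    dv_dlab = -C_M_PER_S * lam_obs / lam_lab ** 2
    v_err = ((dv_dobs * lam_obs_err) ** 2 + (dv_dlab * lam_lab_err) ** 2) ** 0.5
    return v, v_err

# Fitted centroids and their uncertainties (Å) against the laboratory wavelengths.
v1, v1_err = central_velocity(2796.29393, 0.0010641069824035552, 2796.352)
v2, v2_err = central_velocity(2803.47993, 0.0018593463851579672, 2803.530)
```

Both velocities are negative because the fitted centroids are blueward of the laboratory wavelengths, which corresponds to motion toward the observer.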

When testing the consistency of the central velocities of the two absorption lines using a differenced $χ^2$ consistency test, taking into account the uncertainties in the laboratory wavelengths of these lines, I got a $χ^2$ test statistic of 10.06 and a PTE of 0.002. Based on the lecture 18 notes, given that the PTE is low and the $χ^2$ test statistic is relatively high, we conclude that the central velocities of the two absorption lines are inconsistent with each other.
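The differenced test can be sketched as below: the squared velocity difference is divided by the summed variances, and the PTE comes from a $χ^2$ distribution with one degree of freedom. This is a sketch under stated assumptions, not my handwritten derivation: the laboratory-wavelength uncertainty is taken as 0.001 Å for both lines, converted to a velocity with the exact speed of light, and all four errors are treated as independent.

```python
from scipy.stats import chi2

C_M_PER_S = 299_792_458.0   # speed of light in m/s

# Central velocities and fit uncertainties reported in 3.2 (m/s).
v1, v1_err = -6209.12674621733, 113.77966292471372
v2, v2_err = -5340.074961801732, 198.30163008107883

# Laboratory-wavelength uncertainty (assumed 0.001 Å) expressed as a velocity.
lab1_err = C_M_PER_S * 0.001 / 2796.352
lab2_err = C_M_PER_S * 0.001 / 2803.530

# Variance of the difference: the four independent errors add in quadrature.
var_diff = v1_err ** 2 + v2_err ** 2 + lab1_err ** 2 + lab2_err ** 2
chi2_stat = (v1 - v2) ** 2 / var_diff
pte = chi2.sf(chi2_stat, df=1)   # probability to exceed for 1 degree of freedom
```

This gives a $χ^2$ close to 10 and a PTE between 0.001 and 0.002, in line with the values quoted above.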

Because I didn't want to LaTeX my handwritten work, see the Canvas dropbox submission for the written work for this question.